Chinese Syntactic Reordering for Adequate Generation of Korean Verbal Phrases in Chinese-to-Korean SMT
نویسندگان
چکیده
Chinese and Korean belong to different language families in terms of word-order and morphological typology. Chinese is an SVO and morphologically poor language while Korean is an SOV and morphologically rich one. In Chinese-to-Korean SMT systems, systematic differences between the verbal systems of the two languages make the generation of Korean verbal phrases difficult. To resolve the difficulties, we address two issues in this paper. The first issue is that the verb position is different from the viewpoint of word-order typology. The second is the difficulty of complex morphology generation of Korean verbs from the viewpoint of morphological typology. We propose a Chinese syntactic reordering that is better at generating Korean verbal phrases in Chinese-to-Korean SMT. Specifically, we consider reordering rules targeting Chinese verb phrases (VPs), preposition phrases (PPs), and modality-bearing words that are closely related to Korean verbal phrases. We verify our system with two corpora of different domains. Our proposed approach significantly improves the performance of our system over a baseline phrased-based SMT system. The relative improvements in the two corpora are +9.32% and +5.43%, respectively.
منابع مشابه
Chinese Syntactic Reordering through Contrastive Analysis of Predicate-predicate Patterns in Chinese-to-Korean SMT
We propose a Chinese dependency tree reordering method for Chinese-to-Korean SMT systems through analyzing systematic differences between the Chinese and Korean languages. Translating predicate-predicate patterns in Chinese into Korean raises various issues such as long-distance reordering. This paper concentrates on syntactic reordering of predicate-predicate patterns in Chinese dependency tre...
متن کاملTransferring Syntactic Relations of Subject-Verb-Object Pattern in Chinese-to-Korean SMT
Since most Korean postpositions signal grammatical functions such as syntactic relations, generation of incorrect Korean postpositions results in producing ungrammatical outputs in machine translations targeting Korean. Chinese and Korean belong to morphosyntactically divergent language pairs, and usually Korean postpositions do not have their counterparts in Chinese. In this paper, we propose ...
متن کاملA Korean-Japanese-Chinese Aligned Wordnet with Shared Semantic Hierarchy
A Korean-Japanese-Chinese aligned wordnet, “CoreNet” is introduced. For the purpose of this paper, the term “wordnet” refers to a network of words. It is constructed based on a shared semantic hierarchy that is originated from NTT Goidaikei (Lexical Hierarchical System). Korean wordnet was constructed through the semantic category assignment to every meaning of Korean words in a dictionary. Ver...
متن کاملAnnotation Guidelines for Chinese-Korean Word Alignment
For a language pair such as Chinese and Korean that belong to entirely different language families in terms of typology and genealogy, finding the correspondences is quite obscure in word alignment. We present annotation guidelines for Chinese-Korean word alignment through contrastive analysis of morpho-syntactic encodings. We discuss the differences in verbal systems that cause most of linking...
متن کاملChinese Syntactic Reordering for Statistical Machine Translation
Syntactic reordering approaches are an effective method for handling word-order differences between source and target languages in statistical machine translation (SMT) systems. This paper introduces a reordering approach for translation from Chinese to English. We describe a set of syntactic reordering rules that exploit systematic differences between Chinese and English word order. The result...
متن کامل